A study to identify the best district in Moscow for buying a flat.

Baseline: The flat is intended for a family with children

Criteria:

1. Data preparation

1.1 Preparation of data on objects - swimming pools

Count missing values in each column
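A minimal sketch of this check with pandas. The `pools` frame and its column names are hypothetical stand-ins for the real swimming-pool table.

```python
import pandas as pd

# Hypothetical sample of the swimming-pool table; column names are assumptions.
pools = pd.DataFrame(
    {
        "name": ["Pool A", "Pool B", None],
        "address": ["Street 1", None, None],
        "district": ["Arbat", "Tverskoy", "Khamovniki"],
    }
)

# isna() marks missing cells as True; sum() then counts them per column.
missing_per_column = pools.isna().sum()
print(missing_per_column)
```

Columns with a high count are candidates for dropping or for filling in through geocoding.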

Get the geographical coordinates of each object

1.2 Preparation of data on objects - hospitals for adults

1.3 Preparation of data on objects - dental clinics for adults

1.4 Preparation of data on objects - dental clinics for children

1.5 Preparation of data on objects - hospitals for children

1.6 Preparation of data on objects - schools

1.7 Get information about the district through geocoding

1.8 Object clustering

k-means clustering
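To make the idea concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain NumPy. It is an illustration rather than the implementation used in this research: the `kmeans` helper and the sample coordinates are hypothetical, and plain Euclidean distance on latitude/longitude is only an approximation that is reasonable at city scale.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct observations chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    labels = None
    for _ in range(n_iter):
        # Assign every point to its nearest centroid (squared Euclidean distance).
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        # Stop once the assignments no longer change.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Move each centroid to the mean of its points; keep it if the cluster is empty.
        centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    return labels, centroids

# Hypothetical (latitude, longitude) pairs forming two groups.
coords = np.array([
    [55.75, 37.61], [55.76, 37.62], [55.74, 37.60],
    [55.85, 37.45], [55.86, 37.44], [55.84, 37.46],
])
labels, centroids = kmeans(coords, k=2)
print(labels)
```

Unlike the other methods in this section, k-means needs the number of clusters `k` to be chosen up front.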

Affinity Propagation is a graph-based algorithm that assigns each observation to its nearest exemplar. In essence, the observations “vote” for which other observations they want to be associated with, which partitions the dataset into a large number of unevenly sized clusters. It is convenient when the number of clusters cannot be specified in advance, and it suits geospatial data because it handles non-flat geometry well.
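A minimal sketch using scikit-learn's `AffinityPropagation`, assuming scikit-learn is installed. The coordinates are hypothetical, and the `damping` value is an illustrative choice, not a setting taken from this research.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

# Hypothetical (latitude, longitude) pairs; the real data would come from section 1.
coords = np.array([
    [55.75, 37.61], [55.76, 37.62], [55.74, 37.60],
    [55.85, 37.45], [55.86, 37.44], [55.84, 37.46],
    [55.65, 37.75], [55.66, 37.76], [55.64, 37.74],
])

# No number of clusters is passed in; the algorithm selects exemplars itself.
model = AffinityPropagation(damping=0.9, random_state=0)
ap_labels = model.fit_predict(coords)

# Each exemplar is an actual observation; its row index is stored here.
n_clusters = len(model.cluster_centers_indices_)
print(n_clusters, ap_labels)
```

The number of clusters found depends on the `preference` parameter, which defaults to the median of the input similarities.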

Self-Organizing Maps (SOMs) work quite differently, as they are based on neural networks. A SOM is a type of artificial neural network trained with unsupervised learning to produce a low-dimensional representation of the input space, called a “map” (also referred to as a Kohonen layer). The inputs are connected to n x m neurons that form the map; for every observation the “winning” (closest) neuron is computed, and neurons are then clustered together using the lateral distance. Here, I will try a 5x5 SOM:
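The training loop can be sketched in plain NumPy as follows. This is a simplified illustration under stated assumptions: the `train_som` and `winner` helpers, the decay schedules, and the synthetic coordinates are all hypothetical, and a dedicated SOM library would normally be used instead.

```python
import numpy as np

def winner(weights, x):
    """Return the (row, col) of the neuron whose weights are closest to x."""
    dists = ((weights - x) ** 2).sum(axis=-1)
    return np.unravel_index(dists.argmin(), dists.shape)

def train_som(data, rows=5, cols=5, n_iter=500, lr0=0.5, sigma0=2.0, seed=0):
    """Train a rows x cols map; returns weights of shape (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    # Initialise each neuron from a randomly chosen observation.
    picks = rng.integers(0, len(data), size=rows * cols)
    weights = data[picks].reshape(rows, cols, -1).astype(float)
    # Grid position of every neuron, used for the lateral (on-map) distance.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1 - frac)                    # learning rate decays linearly
        sigma = max(sigma0 * (1 - frac), 0.5)    # neighbourhood radius shrinks
        x = data[rng.integers(len(data))]
        bmu = np.array(winner(weights, x))
        lateral = ((grid - bmu) ** 2).sum(axis=-1)
        influence = np.exp(-lateral / (2 * sigma ** 2))
        # Pull the winner and its map neighbours towards the observation.
        weights += lr * influence[..., None] * (x - weights)
    return weights

# Synthetic, standardized coordinates standing in for the real objects.
rng = np.random.default_rng(1)
coords = rng.normal(loc=[55.75, 37.62], scale=[0.08, 0.12], size=(200, 2))
scaled = (coords - coords.mean(axis=0)) / coords.std(axis=0)

som_weights = train_som(scaled)
# Cluster id = index of the winning neuron on the flattened 5x5 map (0..24).
som_clusters = np.array([r * 5 + c for r, c in (winner(som_weights, x) for x in scaled)])
print(np.unique(som_clusters).size)
```

A 5x5 map gives at most 25 clusters; some neurons may win no observations at all.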

The most populated cluster is number 5. Let's look at its composition by category.
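One way to run this check with pandas, assuming a table with one row per object. The `objects` frame, its column names, and its values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical objects table: one row per object with its assigned cluster.
objects = pd.DataFrame(
    {
        "category": ["school", "swimming pool", "school", "hospital", "school", "hospital"],
        "cluster": [5, 5, 5, 2, 2, 5],
    }
)

# The cluster holding the most objects.
largest_cluster = objects["cluster"].value_counts().idxmax()

# Number of objects per category inside that cluster.
by_category = objects.loc[objects["cluster"] == largest_cluster, "category"].value_counts()
print(largest_cluster)
print(by_category)
```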

Let's find out how many unique categories there are in the cluster

We can see that the cluster has all categories of objects
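A claim like this can be verified with a set comparison. The sketch below uses a hypothetical `objects` table with made-up values.

```python
import pandas as pd

# Hypothetical objects table; categories and cluster labels are illustrative.
objects = pd.DataFrame(
    {
        "category": ["school", "swimming pool", "hospital", "school",
                     "dental clinic", "dental clinic"],
        "cluster": [5, 5, 5, 2, 5, 2],
    }
)

all_categories = set(objects["category"])
cluster_categories = set(objects.loc[objects["cluster"] == 5, "category"])

# Categories present in the dataset but absent from cluster 5.
missing_categories = all_categories - cluster_categories
has_all = not missing_categories
print(has_all, missing_categories)
```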

Let's find out which of Moscow's administrative areas are represented in the cluster

We can see that the cluster includes the following areas:

Finally, let's visualize the resulting clusters

Result

As a result of the research, we found that the best districts of Moscow for a family with children to live in are: